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1.  INTRODUCTION 


The  central  theme  of  this  research  project  is  recovery  of  object  shapes  from  noisy  images. 
Immediately,  there  is  the  question  of  what  is  a  shape  and  how  is  it  to  be  represented.  Since 
answers  to  these  questions  have  to  be  ultimately  tailored  to  the  uses  one  has  in  mind,  one  has  to 
bring  into  consideration  potential  applications  and  with  it,  the  question  of  practical  algorithms  for 
implementation  of  the  theory.  This  research  project  is  concerned  with  mainly  two-dimensional 
shapes.  Mathematically,  a  shape  is  the  boundary'  of  an  object  which  is  simply  an  open  subset 
in  the  image  domain,  characterized  in  some  way.  In  the  real  world,  what  is  an  object  and  what 
is  just  noise  or  clutter  depends  of  course  on  what  one  is  looking  for.  For  example,  Figure  la 
shows  a  noiseless  synthetic  image.  It  is  reasonable  to  assume  that  the  objects  in  the  figure  are 
the  four  squares  in  the  four  comers  and  the  two  ellipses  in  the  middle.  Figure  lb  shows  a  noisy 
version  obtained  from  the  image  in  Figure  la  by  adding  Gaussian  noise.  The  signal-to-noise  ratio 
(i.e.  the  ratio  between  the  standard  deviation  of  the  image  with  noise  removed  and  the  standard 
deviation  of  the  noise)  is  1:4.  The  problem  is  to  recover  the  original  objects. 


(a)  NOISE-FREE  IMAGE  (b)  NOISY  VERSION  OF  THE  SAME  IMAGE 

FIGURE  1 


The  long  term  goal  of  this  research  project  is  shape  analysis  and  object  recognition.  A 
common  approach  to  the  problem  of  shape  representation  and  shape  analysis  is  to  assume  that 
the  shape  boundary  has  already  been  found  and  the  problem  is  to  extract  its  salient  features 
for  subsequent  use  in  object  recognition.  There  are  several  approaches  to  this.  One  that  is 
appropriate  for  characterizing  shapes  of  objects  subject  to  a  group  of  transformations  is  to  calculate 
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some  of  the  group  invariants.  The  groups  that  are  usually  considered  are  subgroups  of  the 
group  of  projective  transformations.  This  approach  is  not  appropriate  when  homotopical  or  even 
topological  alterations  of  shapes  must  be  taken  into  account.  Other  approaches  include  finding 
differential  invariants  such  as  the  points  of  maximum  curvature  or  inflection  and  representation 
of  shapes  by  shape  skeletons  such  as  the  medial  axis  transforms  [PBCFM],  Yet  another  approach 
is  to  describe  a  shape  in  terms  of  its  parts  such  as  necks  and  lobes.  The  crucial  question  from 
the  point  of  view  of  this  research  project  is  how  to  recover  these  descriptions  from  noisy  images 
in  a  robust  way. 

The  main  mathematical  techniques  for  achieving  the  goals  of  this  research  project  are  energy 
functionals  and  gradient  descent.  The  general  approach  is  the  construction  of  vertically  integrated 
models  which  incorporate  constraints  imposed  by  various  objectives  such  as  noise  suppression, 
boundary  detection,  shape  description  and  object  matching.  The  motivation  for  building  integrated 
models  comes  from  a  long  history  of  attempts  to  isolate,  formulate  and  solve  simpler  problems 
in  computer  vision  and  the  realization  that  no  matter  how  sophisticated  a  technique  is,  there 
are  inherent  ambiguities  which  cannot  be  resolved  in  isolation.  It  is  essential  to  incorporate 
contextual  information  in  the  model.  Moreover,  these  methods  are  ideal  for  formulating  models 
based  on  raw  intensity  images.  This  is  in  contrast  to  the  earlier  hierachical  approaches  in  which 
features  are  extracted  first  and  then  these  “feature  maps”  are  processed  for  higher  level  tasks.  The 
trouble  is  that  the  features  cannot  be  identified  with  full  accuracy  without  bringing  in  some  higher 
level  knowledge,  and  thus  incorrect  interpretations  may  be  assigned  during  'he  construction  of 
the  feature  map.  By  incorporating  the  raw  image  in  the  model,  one  can  ensure  that  the  input 
to  higher  level  tasks  such  as  object  recognition  reflects  realistic  ambiguities  and  noise,  and  thus 
ensure  robustness  of  the  solution.  At  the  same  time,  integration  provides  further  constraints 
which  may  help  resolve  ambiguities  in  the  image.  The  disadvantage  of  such  an  approach  is  that 
as  more  and  more  realistic  representations  are  incorporated  in  the  energy  functionals,  they  become 
more  and  more  difficult  to  analyze  and  implement.  Efforts  to  develop  fast  stochastic  algorithms 
have  so  far  not  been  very  successful.  An  alternative  approach  is  to  construct  approximations  of 
these  complex  functionals,  even  deliberately  weakening  the  coupling  among  their  components 
and  attempt  to  find  stationary  solutions  by  gradient  descent. 

2.  SEGMENTATION  PROBLEM 
2.1  Segmentation  Functional 

Traditionally,  the  first  step  in  image  understanding  is  the  detection  of  edges.  The  usual 
method  is  to  smooth  the  image  to  reduce  noise  and  then  apply  some  kind  of  local  edge  operator 
such  as  the  zero-crossings  of  the  laplacian.  Geman  and  Geman  [GeGe]  proposed  a  Bayesian 
formulation  for  simultaneous  smoothing  and  boundary  detection.  At  about  the  same  time,  Blake 
and  Zisserman  [BZi,BZ2]  proposed  an  analogous  discrete  model.  An  analytic  version  of  these 
models  is  formulated  in  [MSi  ]  as  follows: 

E(u,B )  =  J  (u-g)2dxdy+  J  \\Vu\\2dxdy  +  u\B\  (1) 

R  R\B 
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where 


R  is  the  image  domain; 

g  is  the  feature  intensity;  g  \  R  —  R; 

D  is  the  union  of  segment  boundaries;  thus  B  is  the  segmenting  curve; 

\B\  is  the  length  of  B\ 

a  and  v  are  the  weights,  (a  may  be  thought  of  as  the  smoothing  radius  in  R\B.) 

The  task  is  to  find  u  and  B  which  minimize  E(u.B).  Thus  the  segmentation  problem  is  viewed 
as  the  problem  of  finding  a  piecewise  smooth  approximation  of  g  with  minimal  amount  of 
discontinuity.  The  weights  a  and  v  control  the  relative  degrees  to  which  smoothness  of  u 
and  its  discontinuity  locus  are  emphasized.  Smoothing  is  explicitly  prevented  from  extending 
over  B.  An  object  in  this  formulation  is  characterized  by  having  relatively  uniform  feature 
intensity.  Many  important  theoretical  results  concerning  existence  and  regularity  have  been 
obtained  [Am,DMS,DCL,[MS2].Ri,Sh4,Wa].  In  particular,  in  [MS2]  it  is  shown  that  the  minimal 
length  constraint  implies  that  there  cannot  be  any  points  where  more  than  three  objects  meet  and 
an  object  cannot  have  comers  except  at  points  where  three  objects  meet.  At  such  comers,  the 
angle  is  always  120°.  Thus  the  formulation  implies  a  very  special  kind  of  shape  representation 
and  it  was  one  of  the  goals  of  this  research  project  to  investigate  ways  in  which  formulation  (1) 
may  be  altered  or  augmented  to  represent  shapes  more  realistically. 

2.2  Dirichlet  Version 

A  simple  way  to  preserve  the  actual  shape  singularities  is  to  impose  Dirichlet  boundary 
conditions  on  u  in  functional  (1),  namely,  u±  =  g±  along  B  where  the  superscript  ±  denote 
the  values  on  the  two  sides  of  B.  Results  concerning  the  existence  and  approximations  in  this 
case  are  proved  in  [Shg],  It  is  shown  that  singularities  of  u  are  then  a  subset  of  those  of  g;  thus 
distortion  of  comers  and  junctions  is  prevented.  The  limitation  of  this  modified  model  is  that 
if  the  actual  boundaries  are  noisy  (i.e.  locally  the  graph  of  a  noisy  function),  they  will  not  be 
smoothed.  Moreover,  if  the  boundaries  in  the  image  are  blurred  as  is  likely  in  practical  situations 
so  that  g  is  continuous,  then  B  would  be  empty.  Hence  the  formulation  must  always  be  used 
with  an  explicit  representation  for  blurring.  Such  a  representation  based  on  Ambrosio-Tortorelli 
approximation  (described  below)  is  formulated  and  analyzed  in  [Sh$].  Note  that  blurring  also 
provides  smoothing  of  noisy  boundaries.  Recovery  of  unblurred  boundaries  from  their  blurred 
version  is  treated  as  a  separate  problem. 

2.3  Extensions 

An  obvious  extension  for  enriching  shape  representation  in  functional  (1)  is  to  incorporate 
information  about  the  curvature  of  the  boundary.  One  such  extension  is  proposed  in  [NM]  where 
\B\  is  replaced  by 

J  (vo  +  v\K2)ds  (2) 

B 

where  «  denotes  the  curvature.  Very  few  theoretical  results  concerning  the  new  functional  have 
been  obtained.  Results  for  simple  closed  curves  based  on  elastica  for  functional  (2)  under  the 
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assumption  that  n0  =  0  and  that  the  length  of  the  curve  is  fixed  have  been  obtained  in  [M,W].  In 
[BDP],  lower  semi-continuity  of  functional  (2)  is  analyzed  and  certain  pathologies  that  arise  are 
exhibited.  For  instance,  even  for  a  simple  heart  shape,  minimizing  sequences  produce  regions  of 
zero  width  in  the  limit.  Practical  difficulties  in  implementing  such  a  functional  are  severe  and 
as  yet.  no  one  has  implemented  it.  Allowing  B  to  be  only  piecewise  continuous  or  introduction 
of  more  shape  information  such  as  illusory  contours  or  3D  interpretations  makes  the  functional 
even  more  intractable.  In  view  of  these  difficulties  which  increase  as  the  functionals  become 
more  complex,  it  was  considered  very  important  to  devote  a  major  part  of  this  research  effort  to 
the  construction  of  approximations  which  are  a  compromise  between  formuladons  expressed  by 
a  single  functional  on  one  hand  and  the  earlier  bottom-up  hierarchical  formulations  on  the  other. 

2.4  Elliptic  Approximation 

What  makes  the  task  of  minimization  of  functional  (1)  very  difficult  is  the  presence  of 
the  one-dimensional  segmenting  curve.  Ambrosio  and  Tortorelli  [ATi,AT2]  proposed  an  elliptic 
approximation  of  functional  (1)  so  that  gradient  descent  may  be  applied.  Their  method  is  as 
follows.  Let 

A„(t0  =  {pIF^II2  +  l—^dxdy  (3) 

R 

For  a  fixed  curve  B,  if  vp  minimizes  A„(v)  with  boundary  condition  v  =  1  on  B,  then  as  p  —  0, 
vp  obviously  tends  to  zero  everywhere  except  on  B  where  it  equals  one.  The  key  observation  is 
that  A p(v)  —  \B\  as  p  —  0.  Values  of  vp  range  between  0  and  1  and  thus  may  be  interpreted  as 
the  probability  for  the  presence  of  a  boundary  point.  Alternatively,  vp  may  be  viewed  as  blurring 
of  B  with  p  as  the  nominal  blurring  radius.  The  method  of  Ambrosio  and  Tortorelli  is  to  replace 
the  term  \B\  in  functional  (1)  by  Ap(v).  B  is  now  spread  out  over  all  of  R  and  we  have  to  modify 
the  integral  in  functional  (1)  as  well  since  we  no  longer  have  the  set  R\B.  A  simple  choice  is  to 

replace  J  ||Vu||2dx<it/  by  J  (1  -  v)2\\Vu\\2dxdy  (4) 

R\B 

The  final  form  of  the  approximate  functional  is 

Ep(u,v)  =  J  g)2 +  (l-vf\\Vu\\2  +  ^p\\Vv\\2  +  jj  ^dxdy 

R 


Ambrosio  and  Tortorelli  prove  that  Ep{u,  v)  converges  to  E(u,  B)  in  the  sense  of  T-convergence. 

The  corresponding  approximate  functional  when  dirichlet  boundary  conditions  are  imposed 
was  derived  and  it  is  as  follows  [Sh^]: 


Em,p(u,v)=  J 

R 


;{u  -  g )2  +  (1  -  v)2||  V«| 


dxdy 


+  l[{^  +  M(u-g)2}^p\\Vv\\2  +  j 
R 


(6) 


dxdy 
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where  M  is  of  order  Ojlogpj. 

It  should  be  noted  that  an  approximation  theory  for  functional  (1)  when  terms  containing 
curvature  and  singularities  of  B  are  introduced  has  yet  not  been  found. 

Stationary  solutions  for  functionals  (5)  and  (6)  may  be  easily  found  by  applying  gradient 
descent.  Thus,  applying  gradient  descent  to  (5),  we  get  the  following  coupled  diffusion  equations: 


Smoothing: 


Boundary  detection 


where  OR  denotes  the  boundary  of  R  and  n  denotes  the  direction  normal  to  OR. 

The  results  of  Ambrosio-Tortorelli  diffusion  (7)  when  applied  to  the  example  shown  in  Figure 
lb  with  a  -  p  =  8  pixels  are  shown  in  Figure  2.  The  image  in  Figure  lb  was  represented  on 
a  256x256  square  lattice.  The  thin  ellipse  has  the  maximum  width  of  9  pixels  and  the  distance 
between  the  two  ellipses  is  18  pixels.  Note  that  long  thin  objects  are  very  difficult  to  locate  in 
noisy  images  because  the  smoothing  radius  is  governed  by  the  noise  characteristics  and  thus  may 
be  much  larger  than  the  width  of  the  object.  Figure  2a  shows  the  smoothed  image  u  and  Figure  2b 
depicts  the  edge  strength  function  v.  The  lighter  the  area  in  the  figure,  the  higher  the  value  of  v. 


(b)  EDGE-STRENGTH  FUNCTION  v 


(a)  SMOOTHED  VERSION  u 


FIGURE  2 


3.  SHAPE  RECOVERY 


The  use  of  approximate  segmentation  functionals  discussed  above  for  constructing  practical 
algorithms  leads  to  a  new  difficulty,  namely,  the  difficulty  of  recovering  the  actual  boundaries 
from  the  edse-strength  function  v.  It  is  apparent  from  Figure  2b  that  thresholding  of  v  will  not 
produce  satisfactory  representation  of  the  boundaries.  The  trouble  is  that  due  to  differing  levels  of 
contrast  along  the  boundaries  and  the  surrounding  noise,  values  of  v  along  the  boundary  are  not 
constant.  That  is,  the  level  curves  of  v  are  only  approximately  parallel  to  the  object  boundaries; 
in  fact,  they  might  even  be  perpendicular  to  the  object  boundary  in  places.  Consequently,  global 
thresholding  of  v  produces  representation  of  the  boundaries  by  narrow  strips  of  varying  width 
which  may  not  completely  enclose  the  object  if  the  contrast  becomes  sufficiently  weak. 

3.1  Adaptive  Thresholding 

Schemes  for  adaptive  local  thresholding  of  v  are  discussed  in  [SI13].  The  basic  idea  is  that  the 
local  threshold  for  v  should  depend  on  the  average  value  v  in  the  neighborhood.  What  this  means 
is  that  we  should  look  at  the  laplacian  of  a  smoothed  version  of  v.  Ambrosio-Tortorelli  equations 
are  applied  to  v  to  produce  such  a  smoothing.  This  approach  to  local  thresholding  of  v  does 
produce  improvement,  but  such  further  processing  was  found  to  produce  further  displacement 
of  the  boundary  from  its  actual  location,  indicating  the  need  for  a  better  method  for  recovering 
the  actual  boundary  from  v. 

3.2  Shape  Recovery  by  Curve  Evolution 

The  basic  idea  is  borrowed  from  the  work  of  Kass,  Witkin  and  Terzopoulos  [KWT]  on 
SNAKES.  In  their  framework,  one  introduces  a  simple  closed  curve  in  the  image  and  lets  it  deform 
under  forces  which  are  determined  by  distance  from  the  object  boundary  and  by  a  smoothness 
constaint.  In  the  framework  presented  here,  the  analogous  idea  [Sh/]  is  to  let  an  initial  curve 
evolve  so  as  to  (locally)  maximize  the  edge-strength  function  v  along  the  curve  in  some  sense. 
Let  F  be  a  simple  closed  curve  in  R.  In  order  to  move  T  to  where  the  image  intensity  gradient 
and  hence  v  are  high,  we  look  for  the  stationary  points  of  the  functional 


r 


where  7  denotes  the  arc-length  along  T.  The  evolution  equation  for  T  is  derived  by  applying 
gradient  descent  to  L.  Let  C{p,t)  :  I  x  [0,oo)->  R  be  the  evolving  family  of  curves  where  I  is 
the  unit  interval  and  t  denotes  time.  We  require  that  C(0,  t)  =  C(l,t)  for  all  values  of  t  and  that 
the  image  of  C{p,  0)  in  R  coincides  with  T.  Then  the  evolution  of  T  is  governed  by  the  equation 

=  [aVv  •  iV  -  (1  -  v)k\N  (9) 

where  N  is  the  outward  normal  and  k  is  the  curvature  which  is  defined  such  that  it  is  positive 
when  T  is  a  circle.  Thus  the  points  on  the  evolving  curve  move  in  the  direction  of  the  normal 
with  velocity  which  has  two  components:  the  component  depending  on  curvature  which  imposes 
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a  smoothness  constraint  and  advection  induced  by  v.  The  curve  is  pulled  towards  the  object 
boundary  by  the  force  field  V  v  induced  by  v.  The  exponent  a  in  the  expression  for  L  serves  as 
a  weighting  factor.  The  higher  the  value  of  a,  the  weaker  the  smoothing  constraint. 

The  implementation  of  curve  evolution  is  no  longer  a  straightforward  matter  of  using  finite 
differences.  Moving  the  points  on  an  evolving  curve  directly  by  discretizing  the  curve  leads  to 
many  difficulties.  For  example,  the  chosen  points  might  bunch  up  causing  numerical  instabilities. 
The  curve  may  also  undergo  topological  changes.  An  alternate  method  is  the  one  proposed 
by  Osher  and  Sethian  [OS].  In  their  approach,  the  initial  curve  is  embedded  in  a  surface  as  a 
level  curve  and  then  evolution  is  applied  to  the  surface  so  that  all  of  its  level  curves  evolve 
simultaneously.  Assume  that  T  is  embedded  in  a  surface  /0  :  R  -^R  as  a  level  curve.  Let 
f(t.x.y)  denote  the  evolving  surface  such  that  f(0,x,y)  =  fo{x,y).  Then,  in  order  to  evolve 
all  the  level  curves  of  /o  simultaneously,  we  consider  the  functional 

CO 

G(f)=  I  I  (l-  vfd'jcdc  (10) 

-co  rc 

where  Tc  =  {(x,y)\f(t,x,y)  =  c}.  But  by  the  coarea  formula, 


For  numerical  implementation  of  equation  (11),  the  usual  finite-difference  schemes  (say, 
central  differences)  are  exactly  the  wrong  thing  to  apply  and  must  never  be  used.  The  main  point 
of  Osher  and  Sethian  is  that  since  we  expect  the  evolving  surface  to  develop  discontinuities  or 
’shocks’  where  the  object  boundaries  are,  the  directions  of  the  finite  differences  must  always 
be  away  from  the  shock;  a  finite-difference  must  never  straddle  a  shock.  Once  this  principle  is 
incorporated  in  the  numerical  scheme,  nonlinear  diffusion  (11)  behaves  very  robustly. 

An  important  question  now  is:  How  to  specify  the  intial  curve  T?  An  automatic  specification 
of  the  initial  curve  is  a  difficult  problem  because  its  solution  implies  that  the  object  boundaries 
have  already  been  found,  at  least  approximately.  The  strategy  used  in  [Shy]  for  specifying  the 
initial  curve  is  based  on  the  following  considerations.  If  we  set  v  =  1  and  /0  =  g  in  equation  (11), 
then  it  reduces  to  the  diffusion  equation  proposed  in  [ALM]  for  smoothing  images.  In  [ALM], 
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diffusion  is  allowed  to  occur  only  in  the  direction  of  the  level  curves,  the  expectation  being  that 
in  this  way,  the  object  boundaries  will  be  smoothed  without  being  blurred.  The  difficulty  with 
this  approach  is  of  course  that  the  object  boundaries  in  general  are  not  level  curves  in  an  image. 
Traditional  use  of  the  zero-crossings  of  the  laplacian  of  smoothed  images  to  detect  edges  may 
be  thought  of  as  an  attempt  to  remedy  this  situation  by  representing  object  boundaries  by  the 
level  curves  of  the  smoothed  laplacian  instead  of  the  level  curves  of  the  image  itself.  Hence,  the 
strategy  in  [Sh7]  is  to  use  the  zero-crossings  of  the  laplacian  as  the  initial  approximation  T  for 
the  object  boundaries  and  choose  /0  equal  to  the  laplacian  of  a  smoothed  version  of  the  image. 
Now,  if  the  raw  image  g  is  very  noisy,  the  laplacian  of  the  smoothed  image  u  obtained  from 
Ambrosio-Tortorelli  diffusion  will  also  be  very  noisy.  To  see  this,  consider  the  behavior  of  u 
in  the  limit  as  p  —  0.  In  the  limit,  u  is  a  stationary  solution  of  the  original  problem  (1)  and 
satisfies  the  differential  equation  V2u  =  (u  -  g)/a2,  indicating  that  the  laplacian  will  be  as  noisy 
as  the  original  image.  It  will  have  many  saddle  points,  indicated  by  many  self-intersections  of 
its  zero-crossings.  The  theory  of  curve  evolution  is  based  on  the  assumption  that  T  is  a  simple 
closed  curve,  an  assumption  which  is  violated  at  the  saddle  points  of  /.  If  the  laplacian  is  too 
noisy,  the  evolution  is  dominated  by  what  happens  at  its  saddle  points  and  becomes  unpredictable. 
This  is  because  at  a  saddle  point,  the  first  term  on  the  right  hand  side  of  the  evolution  equation 
(11)  vanishes  and  both  the  numerator  and  the  denominator  vanish  in  the  second  term.  Hence,  the 
behaviour  of  the  second  term  becomes  very  sensitive  to  noise  near  a  saddle  point.  In  order  to 
demonstrate  the  feasibility  of  this  approach,  an  ad  hoc  solution  to  this  problem  has  been  adopted 
in  [Sh7],  We  smooth  u  further  by  non-uniform  smoothing  as  described  below  until  the  zero- 
crossings  of  the  laplacian  have  relatively  few  self-intesections.  Recall  that  we  can  implement 
uniform  Gaussian  smoothing  of  u  by  applying  gradient  descent  to  the  functional 


1 1  \\Vw\\2dxdy  (14) 

R 

setting  iv  =  u  at  t  =  0.  In  order  to  reduce  the  displacement  of  the  significant  boundaries  (indicated 
by  high  values  of  v)  due  to  smoothing,  we  replace  the  above  functional  by  the  functional 


I  j(  1  -  v)Q\\ViLtfdxdy  (15) 

R 

The  corresponding  gradient  descent  equation  is 


dw 

dt 


-aVv  •  V-u;  +  (1  -  v)V2w 


— —  =  0  along  the  boundary  of  R 
an 


w  =  u  at  t  =  0 


(16) 


The  diffusion  by  equation  (15)  is  stopped  by  the  user  when  the  level  curves  of  V2tt?  have 
relatively  few  self-intesections.  Then  we  set  /0  =  V2w  and  let  it  evolve  according  to  equation 
(11).  The  zero-crossings  of  the  evolving  /  are  taken  as  the  successive  refinements  of  the  object 
boundaries. 
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Figures  3  and  4  show  the  results  of  evolution  when  applied  to  the  example  shown  in  Figure  lb. 
The  laplacian  of  a  was  so  noisy  that  almost  all  the  pixels  were  zero-crossings  of  the  laplacian. 
Hence,  u  was  further  smoothed  by  nonuniform  diffusion  (15)  and  then  evolution  by  equation 
(11)  was  applied  to  its  laplacian.  Figure  3  shows  the  zero-crossings  of  the  evolving  laplacian 
at  t  =  0.40.  SO.  160.320.640.1280.  Thus,  the  zero-crossings  at  t  =  0  are  the  zero-crossings  of 
the  laplacian  of  u  smoothed  by  equation  (15).  The  example  illustrates  that  the  boundaries  found 
by  this  method  are  at  least  metastable,  that  is,  they  persist  for  a  long  time.  Figure  4  shows  the 
superposition  of  the  zero-crossings  smoothed  by  evolution  (t  =  1280)  on  the  noisefree  image 
(Figure  4a)  as  well  as  on  the  smoothed  image  u  (Figure  4b).  Smoothed  image  u  was  used  for 
superposition  rather  than  g  because  the  evolution  is  governed  by  v  which,  in  turn,  is  determined 
by  the  boundaries  in  u.  Note  the  accuracy  of  placement  of  the  smoothed  zero-crossings  on  the 
noiseless  image,  including  the  thin  ellipse.  The  comers  are  also  fairly  well  represented.  The 
boundary  deviates  in  places  from  the  boundary  in  the  noiseless  image  when  it  follows  some 
accidental  feature  introduced  by  the  noise.  (It  is  possible  to  discern  these  features  by  close 
inspection  of  Figure  2a.)  It  is  interesting  that  although  the  four  comer  squares  are  identical  in 
the  noiseless  image,  the  final  boundaries  are  different  in  all  four  cases  because  the  accidental 
noise  features  are  different.  The  worst  deviation  occurs  in  the  case  of  the  lower  right  comer 
square.  The  very  thin  ends  of  the  thin  ellipse  are  lost  because  the  boundary  representation  v  is 
too  coarse  (p  -  8  pixels).  Some  portions  towards  the  ends  of  the  thin  ellipse  are  lost  because 
they  are  obscured  by  noise  as  one  can  see  in  Figure  2a. 
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(a)  SUPERPOSITION  ON  NOISE-FREE  IMAGE  (b)  SUPERPOSITION  ON  SMOOTHED  IMAGE  u 

FIGURE  4 

4.  DIFFUSION  SYSTEMS 

Use  of  curve  evolution  for  shape  recovery  as  described  above  exemplifies  an  approach  for 
getting  around  the  difficulties  in  finding  the  minimizing  solutions  for  energy  functionals.  The 
basic  ingredient  was  to  incorporate  the  edge-strength  function  v  in  the  algorithms  designed  to 
solve  the  next  level  of  tasks  in  computer  vision.  If  these  tasks  are  also  governed  by  evolution 
equations,  then  the  idea  is  to  let  this  evolution  be  controlled  by  v.  In  fully  integrated  models,  one 
would  try  to  construct  a  coupled  system  of  diffusion  equations  in  which  each  equation  controls 
one  particular  process  or  a  task  like  smoothing  or  edge  detection  or  matching  and  outputs  of  these 
equations  provide  controls  for  each  other’s  evolution.  The  diffusion  system  (7)  of  Ambrosio  and 
Tortorelli  may  be  viewed  as  a  system  of  this  type.  Other  examples  are  described  below.  The 
difficulty  in  designing  nonlinear  systems  of  diffusion  equations  for  problems  in  computer  vision 
is  that  such  systems  are  one  of  the  most  difficult  mathematical  objects  to  analyze  and  no  general 
mathematical  principles  are  available  to  guide  the  design. 

4.1  Stereo 

In  stereo  vision,  the  basic  problem  is  the  correspondence  problem,  that  is,  the  problem  of 
matching  the  two  views  seen  by  the  two  eyes  (or  cameras)  and  thus  computing  the  disparity 
between  the  two  images.  A  major  complication  is  the  presence  of  discontinuities.  Due  to  sudden 
changes  in  the  distance  of  the  objects  from  the  eyes,  there  is  a  sudden  jump  in  the  disparity. 
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Another  consequence  of  this  is  the  phenomenon  of  half-occlusions  which  are  the  areas  in  the 
scene  seen  by  one  eye  and  not  the  other.  Thus  it  is  necessary  to  prevent  matching  of  the  two 
images  in  areas  of  half-occlusions.  A  coupled  diffusion  system  of  4  equations  is  formulated  in 
[Shj]  as  follows: 

^  =  V  •  (1  -  WffVdp  +  !•(/,  -  Ir  o  °  //)  (1  -  o  f-1)2 

=  pv  •  vwt  -  —  +  -(1  -  u*)||V/<||2 

df  p  u  (17) 

~  =  V  •  (1  -  UV)2Vdr  +  ^(Ir  -  If.  O  A)(|f  0  /r)  (1  -  ^  O  f-1)2 

=  pV  •  Vuv  -  —  +  l-  uv)||V/r||2 

dt  p  v 

where 

df,  dr  are  the  disparities  as  seen  from  the  left  and  right  eyes  respectively; 

I,:.Ir  are  the  left  and  right  images; 

X(,  xT  are  the  coordinates  in  the  left  and  right  images  parallel  to  the  epipolar  lines; 
ft  —  x  f  A  df ,  f r  —  3:  r  T  dy » 

W(,  uy  represent  the  discontinuity  locus  of  disparities  dp .  dr  respectively. 

The  first  and  third  equations  are  the  stereo  matching  processes  from  the  left  and  right  eyes 
respectively,  while  the  second  and  the  fourth  equations  are  discontinuity  detection  processes. 
The  formulation  is  illustrated  by  examples  in  [Shs] 


4.2  Shape-from-Shading 


The  problem  here  is  to  solve  the  irradiance  equation  to  recover  the  surface  shape  from  the 
shading  data.  As  usual,  one  has  to  account  for  noise  and  also  discontinuities  in  surface  shape 
in  the  form  of  creases.  In  [SHG],  a  formulation  is  derived  for  the  shape-from-shading  problem, 
taking  into  account  noise  and  discontinuities,  and  also  allowing  fusion  with  data  from  other 
sources  such  as  range  images.  The  diffusion  system  is  given  as  follows: 


I Uv.v  +  ^-,) 

o? 

where  vector  V  =  —  (R  -  /)V jxjvR  - 


V  •  (1  -  s)2V fx,  V  •  (1  -  s)2V fy 


(18) 


ds 

dt 


=  V  •  Vs - T  + 


2A4 

up 


(ll/tll2  +  ll/vl|2)(l  -  ») 


Here, 


/  represents  the  surface  to  be  recovered; 
d  is  the  range  data; 

/  is  the  intensity  image; 

R  is  the  reflectance  function  defined  in  terms  of  the  derivatives  /x,/y; 
V/xi/y  is  the  gradient  operator  with  respect  to  variables  /r,/y; 
s  represents  the  locus  of  creases  in  the  surface  /. 
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The  first  equation  fits  the  surface  to  the  range  data  and  to  the  intensity  image  though  the  reflectance 
map.  The  surface  is  regularized  by  a  second  order  smoothing  constraint.  The  last  equation  detects 
discontinuities  in  the  gradient  of  /  and  restricts  smoothing  of  /  wherever  creases  occur. 


4.3  Nonlinear  Smoothing 

Alternate  systems  of  diffusion  equations,  coupling  a  smoothing  process  and  a  boundary 
detection  process  in  a  manner  analogous  to  the  Ambrosio-Tortorelli  system  (7)  are  described  in 
[Sh3]  and  [PPGO]. 

4.4  Optical  Flow 

The  framework  of  coupled  diffusion  systems  developed  in  this  research  project  has  been 
used  by  Proesmans  et  al  [PGOj  to  construct  a  system  of  six  diffusion  equations  for  calculating 
motions  of  objects  moving  at  different  speeds  and  possibly  occluding  each  other. 


13 


BIBLIOGRAPHY 


[ALM]  L.  Alvarez,  P.L.  Lions  and  J.M.  Morel:  ’’Image  Selective  Smoothing  and  Edge  Detection  by 
Nonlinear  Diffusion  II”,  SIAM  J.Num.Anal,  (June, 1992). 

[ATj,]  L.  Ambrosio  and  V.M.  Tortorelli:  ’’Approximation  of  Functionals  Depending  on  Jumps  by 
Elliptic  Functionals  via  T -convergence”,  Preprint,  Scoula  Normale  Superiore,  Pisa,  (1988) 
[AT? ]  L.  Ambrosio  and  V.M.  Tortorelli:  ”On  the  Approximation  of  Functionals  depending  on 
Jumps  by  Quadratic,  Elliptic  Functionals”,  Boll.  Un.  Mat.  Ital.  (1992). 

[BDP]  G.  Bellettini,  G.  Dal  Maso  and  M.  Paolini:  ’’Semicontinuity  and  Relaxation  Properties  of  a 
Curvature  depending  Functional  in  2d”,  SISSA  Report,  17/92/MA,  (Feb.  1992). 

[BZi  ]  A.  Blake  and  A.  Zisserman:  ’’Using  weak  continuity  constraints”,  Report  CSR- 186-85,  Dept. 

of  Comp.  Sci.,  Edinburgh  Univ.  (1985) 

[BZi]  A.  Blake  and  A.  Zisserman:  Visual  Reconstruction,  M.I.T.  Press  (1987) 

[DMS]  G.  Dal  Maso,  J.M.  Morel  and  S.  Solimini:  ”A  Variational  Method  in  Image  Segmentation: 
Existence  and  Approximation  Results”,  Preprint,  CEREMADE,  Universite  Paris  -  Dauphine, 
Paris  (1988) 

[DCL]  E.  De  Giorgi,  M.  Carriero  and  A.  Leaci:  ’’Existence  Theorem  for  a  Minimum  Problem  with 
Free  Discontinuity  Set”,  Preprint,  University  of  Lecce,  Italy  (1988) 

[GeGe]  S.  Geman  and  D.  Geman:  ’’Stochastic  relaxation,  Gibbs’  distributions,  and  the  Bayesian 
restoration  of  images”,  IEEE  Trans.,  PAMI  6,  pp.  721-741  (1984) 

[KWT]  M.  Kass,  A.  Witkin  and  D.  Terzopoulos:  ’’Snakes:  Active  Contour  Models”,  First  Interna¬ 
tional  Conf.  on  Computer  Vision,  (1987). 

[M]  D.  Mumford:  ’’Elastica  and  Computer  Vision”,  Preprint. 

[MSi]  D.  Mumford  and  J.  Shah:  ’’Boundary  detection  by  minimizing  functionals,  I”,  Proc.  IEEE 
Conf.  on  Computer  Vision  and  Pattern  Recognition,  San  Francisco  (1985) 

[MSo]  D.  Mumford  and  J.  Shah:  ’’Optimal  approximations  by  piecewise  smooth  functions  and 
associated  variational  problems”,  Comm,  on  Pure  and  Appl.  Math.,  v.  XLII,  n.5,  pp.577- 
684  (July,  1989) 

[NM]  M.  Nitzberg  and  D.  Mumford:  ”The  2. ID  Sketch”,  Third  Int’l  Conf.  on  Comp.  Vision, 
(December,  1990). 

[OS]  S.  Osher  and  J.  Sethian:  ’’Fronts  Propagating  with  Curvature  Dependent  Speed:  Algorithms 
based  on  the  Hamilton-Jacobi  Formulation”,  J.  Comp.  Physics,  79,  (1988). 

[PBCFM]  S.M.  Pizer,  C.A.  Burbeck,  J.M.  Coggins,  D.S.  Fritsch,  B.S.  Morse:  ’’Object  Shape  before 
Boundary  Shape:  Scale-Space  Medial  Axes”,  J.  of  Math.  Imaging  and  Vision,  (1993). 
[PPGO]  M.  Proesmans,  E.J.  Pauwels,  L.J.  Van  Gool  and  A.  Oosterlinck:  ’’Image  Enhancement  using 
Non-Linear  Diffusion”,  Proc.  IEEE  Conf.  on  Comp.  Vision  and  Pattern  Recognition,  (June, 
1993). 

[PGO]  M.  Proesmans,  L.J.  Van  Gool  and  A.  Oosterlinck:  ’’Determination  of  Optical  Flow  and  its 
Discontinuities  using  Non-linear  Diffusion”,  Preprint,  ESAT-MI2,  Katholieke  Univ.  Leuven, 
Leuven,  Belgium,  ((1993). 

[Ri]  T.  Richardson:  Ph.D.  Thesis,  Department  of  Electrical  Engineering  and  Computer  Science, 
MIT  (1990) 

[Shi]  J.  Shah:  ’’Parameter  Estimation,  Multiscale  Representation  and  Algorithms  for  Energy- 
Minimizing  Segmentations”,  Tenth  International  Conference  on  Pattern  Recognition,  (June, 


14 


1990) 

[ S h . ]  J.  Shah:  "Segmentation  by  Nonlinear  Diffusion”,  Proc.  IEEE  Conf.  on  Comp.  Vision  and 
Pattern  Recognition,  (June,  1991). 

[Sh:i]  J.  Shah:  "Segmentation  by  Nonlinear  Diffusion,  II”,  Proc.  IEEE  Conf.  on  Comp.  Vision 
and  Pattern  Recognition,  (June,  1992). 

[Sh[]  J  c.hah:  "Properties  of  Energy-Minimizing  Segmentations”,  SIAM  J.  on  Control  and  Optim. 
v.30,  no.l,  99-111,  (1992). 

[Sh,-]  J.  Shah:  “A  Nonlinear  Diffusion  Model  for  Discontinuous  Disparity  and  Half-Occlusions  in 
Stereo”,  Proc.  IEEE  Conf.  on  Comp.  Vision  and  Pattern  Recognition,  (June,  1993). 

[Sh,’]  J.  Shah:  “Piecewise  Smooth  Approximations  of  Functions”,  Calculus  of  Variations  and  Partial 
Differential  Equations,  2,  pp.315-328,  (1994). 

[Sh-]  J.  Shah:  ’’Recovery  of  Shapes  by  Evolution  of  Zero-Crossings”,  submitted  for  presentation 
at  the  Fifth  International  Conference  on  Computer  Vision,  (1995). 

[SHG]  J.  Shah:  “Role  of  Fusion  and  Regularization  in  the  Shape-from-Shading  Problem”,  submitted 
to  Proc.  IEEE  Conf.  on  Comp.  Vision  and  Pattern  Recognition,  (June,  1994). 

[SK]  K.  Siddiqi  and  B.B.  Kimia:  ’’Parts  of  Visual  Form:  Computational  Aspects”,  IEEE  Conf. 
on  Vision  and  Pattern  Recognition,  (1993). 

[STK]  K.  Siddiqi,  K.  Tresness  and  B.B.  Kimia:  ’’Parts  of  Visual  Form:  Ecological  and  Psychological 
Aspects”,  Technical  Report  LEMS-104,  Div.  of  Engineering,  Brown  University,  (Feb.  1994). 

[Wa]  Y.  Wang:  Ph.D.  Thesis,  Dept,  of  Math,  Harvard  Univ,  (1989). 

[W]  Y.  Wen:  ”L2  Flow  of  Curve  Straightening  in  the  Plane”,  Duke  Math.  J.  v.70,  n.3,  pp.683- 
398,  (June,  1993). 


PUBLICATIONS 


1.  Recovery-  of  Shapes  by  Evolution  of  Zero-Crossings,  Submitted  to  International  Conference 
on  Computer  Vision,  to  be  held  in  May,  1995. 

2.  Recovery  of  Shapes  of  Surfaces  with  Discontinuities  by  fusing  Shading  and  Range  Data  within 
a  Variational  Framework,  (with  H.  Pien  and  J.  Gauch),  Submitted  to  IEEE  Proceedings  on 
Image  Processing. 

3.  Piecewise  Smooth  Approximations  of  Functions:  Calculus  of  Variations  and  Partial  Differ¬ 
ential  Equations,  v.2,  pp. 3 15-328  (1994). 

4.  A  Nonlinear  Diffusion  Model  for  Discontinuous  Disparity  and  Half-Occlusions  in  Stereo: 
IEEE  Conference  on  Computer  Vision  and  Pattern  Recognition,  New  York,  N.Y.,  June,  1993. 

5.  Segmentation  by  Nonlinear  Diffusion,  II:  Proc.  of  the  IEEE  Conference  on  Computer  Vision 
and  Pattern  Recognition, Champaign,  IL,  June,  1992. 

6.  Segmentation  by  Nonlinear  Diffusion  Proc.  of  the  IEEE  Conference  on  Computer  Vision  and 
Pattern  Recognition, Lahaina,  Hawaii,  June,  1991. 


SCIENTIFIC  PERSONNEL:  Jayant  Shah 


16 


